INTELLIGENT HARVESTING AND
NAVIGATION SYSTEM AND METHOD 

This application claims benefit of provisional application 
Ser. No. 60/160,801 filed Oct. 21, 1999.

BACKGROUND OF THE INVENTION 

This invention relates generally to a system and method 
for delivering content to information appliances and in 
particular to a system and method for permitting web pages
in different formats to be communicated to various different 
information appliances having different display footprints. 

Today, people have an unquenchable thirst for informa- 
tion that demands instant access at any time or place to the 
information. There has been explosive growth in the hand- 
held computing market and in the cell phone market with 
over 300 million users worldwide. In addition, there has 
been a continued increase in the number of people with 
access to the Internet. These three different markets will
soon converge as cell phones and handheld computers will
have web browsers ("mini browsers") integrated therein. To 
promote the convergence, cell phone manufacturers and 
wireless data network providers have attempted to standardize
Internet content distribution with the wireless application
protocol (WAP). Internet content providers have been slow 
to adopt these standards and now several major device 
manufacturers have begun to create their own proprietary 
standards. 

Two factors hinder the extension of the Web and its content
from the personal computer (PC) environment with fairly 
standard display formats to the non-PC based information 
appliances and devices. First, re-purposing and converting
existing PC-centric HTML web sites to the new breed of
information appliances with drastically varying screen sizes
is very problematic. For example, it is not appropriate to put 
the content of a PC-centric web site onto a small smart 
phone screen linearly. An intelligent navigation scheme that 
automatically converts content intended for the PC into 
content applicable to one or more different information
appliances is needed. This intelligent navigation scheme 
would optimally vary its output based on the screen size of 
the particular information appliance. Second, most current 
wireless content delivery solutions demand adherence to 
proprietary browsers, proprietary mark-up languages and/or
proprietary protocols. 

As FIG. 1 illustrates, there are currently multiple different 
mark-up languages 2, multiple different protocols 3 and 
different browsers 4. For example, Phone.com has introduced
both the HDML and the WML protocols for cellular
telephones in the United States, whereas Japan has adopted 
the I-mode protocol. Palm Pilot devices support a variant of
the HTML protocol that uses web-clipping, while Windows 
CE devices support only a limited HTML protocol using 
special software such as Pocket Explorer. To establish an
effective wireless presence, a company with content must 
support the multitude of different information appliances 5, 
the different protocols 3, the different markup languages 2 
and the different browsers 4. 

The number of devices, protocols and mark-up languages
creates a large matrix of different combinations of devices,
protocols and languages wherein each combination requires 
a different web server. The rewriting of a site for each 
mark-up language and for interfacing with each different 
protocol and screen size is expensive, complicated and time
consuming. In addition, because each different device may 
have a different input/output format, such as a different 
screen size, the presentation of information on any one
device is not optimized, for example, for its screen size due 
to the variety of formatting alternatives. 

Most of the prior approaches to wireless content delivery 
have involved linking a certain browser with a certain 
protocol and a certain mark-up language. There is not a 
single standard that is pervasive throughout the multitude of 
wireless handheld devices and information appliances. As a 
consequence of this lack of a standard, content providers are 
forced to re-author and re-format their web pages in order to
generate content for each of these devices. 

Compatible languages, such as Extensible Markup Lan- 
guage (XML), a software language designed especially for 
Web documents, have become much more mature and 
permit re-formatting of HTML or XML web pages on-the- 
fly to formats that individual devices can utilize. However, 
none of these conventional systems and solutions provide a 
single unified system that permits web pages having differ- 
ent formats and mark-up languages to be delivered to 
different information appliances that may use different 
protocols, different browsers, or have different input/output 
formats (e.g., different screen sizes). Thus, none of the 
conventional systems provide an intelligent navigation sys- 
tem wherein content may be delivered in a customized 
manner to the different information appliances. 

Another problem with the conventional systems is that 
content providers have not been able to control the "look and 
feel" of their site using these other solutions so that the site 
may look very different on different devices. It is desirable, 
however, to provide a system that will allow these sites to 
customize the presentation of their site web pages to the 
wide variety of information appliances. Thus, it is desirable 
to provide a content delivery system and method that solves 
the above limitations and problems with the conventional 
systems and it is to this end that the present invention is 
directed. 

SUMMARY OF THE INVENTION 

The content delivery system and method in accordance 
with the invention solves the above problems and limitations 
with conventional systems and solutions by providing a 
system and method that delivers Web-based content, 
commerce, enabling transactions, and services to a variety of 
information appliances and devices without requiring the 
re-authoring of the content information for display on each 
of these different devices. 

In accordance with the invention, the system and method 
permits content to be input into the system in a variety of 
different formatting languages. In addition, the system per- 
mits the formatted content to be output in any mark-up 
language and protocol, such as WML, HTML, HDML, 
XML, etc. Advantageously, each display page on the device 
may be customized. To organize the content for display on 
the devices, the received content information may be 
mapped into a hierarchy of groups so that the content 
information can be optimally formatted for display on the 
devices according to the input/output format, such as the 
display screen size parameters of the devices. 

In more detail, the method for content delivery may 
include intelligently harvesting content from a web page to 
provide that content to a plurality of different information 
appliances having different screen sizes. The intelligent 
harvesting may convert the content into a proprietary rela- 
tional markup language (RML) and generate a tree and then 
a document object model from the RML content. The tree 
may then be analyzed and searched using a set of processing 
rules in order to generate content screens customized to each
information appliance. A typical card builder may build the 
card corresponding to the customized content and a typical 
deck builder may build a deck of cards corresponding to the 
one or more display screens that make up the content for the
particular information appliance. The deck of cards may 
then be converted into a presentation format and protocol for 
the particular information appliance and sent to that infor- 
mation appliance. 

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a diagram illustrating examples of different 
conventional wireless content delivery systems; 

FIG. 2 is a block diagram illustrating a content delivery 
system in accordance with the invention; 

FIG. 3 is a functional diagram illustrating a content
delivery system in accordance with the invention; 

FIG. 4 is a diagram illustrating the translation server of 
the content delivery system shown in FIG. 2; 

FIG. 5 is a block diagram illustrating an embodiment of 
the content delivery system of the invention;

FIG. 6 is a diagrammatic view illustrating the content 
connection handler of the translation server shown in FIG. 
4; 

FIG. 7 is a diagram illustrating an intelligent harvesting 
method in accordance with the invention;

FIG. 8 is a diagrammatic view illustrating the XML 
engine of the translation server shown in FIG. 4; 

FIG. 9 illustrates an example of the mapping of an HTML 
web page into an RML object in accordance with the 
invention;

FIG. 10 is a diagrammatic view of the layout engine of the 
translation server shown in FIG. 4; 

FIG. 11 illustrates an example of a portion of an HTML- 
based web page from the E-TRADE website illustrating the 
grouping of different elements of the web page in accor- 
dance with the invention; 

FIG. 12 is a diagrammatic view of a data structure tree for 
ordering the different groups and atomics of the E-TRADE 
web page shown in FIG. 11;

FIG. 13 is a diagram illustrating a recursive tree analysis 
in accordance with a preferred embodiment of the invention; 

FIG. 14 is a diagrammatic view of the tree structure 
shown in FIG. 12 illustrating a collapsing methodology for 
processing the tree in order to create cards in accordance
with the invention; 

FIG. 15 is a diagrammatic view illustrating an example of 
card formats that are created in processing the tree structure 
shown in FIG. 12; 

FIG. 16 is an example of a portion of an HTML-based
web page from the CitySearch.com website illustrating the 
grouping of different elements of the web page in accor- 
dance with the invention; 

FIG. 17 is a diagrammatic view of a data structure tree for 
ordering the different groupings and atomics of the
CitySearch.com web page shown in FIG. 16;

FIG. 18A is an example of a screen shot of a Palm Pilot 
device showing a presentation page of the CitySearch.com 
website shown in FIG. 16; and 

FIG. 18B is an example of a series of screen shots of a
cellular telephone device showing a presentation page of the 
CitySearch.com website shown in FIG. 16. 

DETAILED DESCRIPTION OF A PREFERRED
EMBODIMENT

The invention is particularly applicable to a system and
method for delivering web content to a variety of different
information appliances having different display formats and
different screen sizes and it is in this context that the 
invention will be described. It will be appreciated, however, 
that the system and method in accordance with the invention 
has greater utility such as to systems for delivering other 
types of content to devices having different input/output 
formats. 

The content delivery system in accordance with the 
invention provides many advantages over the conventional 
systems. As described below, the system permits content in 
a variety of different formats, such as HTML, XML, raw 
data, etc., to be input into the system and then permits the 
content to be output in a variety of different output formats 
and protocols, such as WML, HTML, HDML, XML, etc., so
that the same incoming content may be displayed on many 
different information appliances and devices having differ- 
ent screen sizes. In a preferred embodiment, the system may 
use an XML data structure and the system may include one 
or more different software applications that may be based on 
the JAVA language. 

In accordance with the invention, content providers do not 
have to change their existing infrastructure or use multiple 
different web servers to send information to various infor- 
mation appliances. In particular, the system receives incom- 
ing content on-the-fly from an Internet content provider 
thereby allowing for dynamic information generation. In 
more detail, the system in accordance with the invention 
manipulates standard web pages into a relational markup 
language (RML) that permits, for each information 
appliance, a "set" of pages that is more useful to the 
individual device and dependent on the device. Now, the
content delivery system in accordance with the invention 
will be described. 

FIG. 2 is a diagram illustrating a content delivery system 
10 that may include a translation system 12 in accordance 
with the invention. The translation system 12 may be any 
computer system, such as a server or workstation, with 
sufficient computing power to handle the functions being 
performed as described below. In a preferred embodiment, 
the translation system 12 may be a server computer that 
stores one or more different software applications that may 
be executed by the CPU of the server in order to implement 
the functions of the content delivery system described 
below. The translation server 12 may allow content provid- 
ers 13 to deliver their content (in different formats as shown) 
to one or more different information appliances 15 without 
needing to reformat, re-author or rebuild an existing web site 
in order to deliver it to the different information appliances 
15 using different communications formats as shown. It is 
desirable to provide an upwardly scalable robust server 12 
so that software can be coded without requiring updates as 
server loads increase. Rather, such increased server loads 
may be handled by adding additional hardware components, 
for example memory, to the server. It is also desirable that 
the server 12 be platform independent to support any oper- 
ating system environment, such as UNIX, Windows, Macin- 
tosh and any other operating system. 

In more detail, the translation server 12 may take infor- 
mation directly from an Internet content provider's web site 
in various forms, such as HTML data, XML data, or raw data 
feeds and then re-deliver it, via the translation server 12 and 
through a telecommunications system 14, such as a wireless
carrier base station that uses a typical communications 
format such as CDPD, to information appliances 15 in a 
format that is completely customized to the end user's 
device type and browsing capabilities. Thus, the content 
delivery system and method may generate and output WML, 
HDML, tiny HTML, compact HTML, HTML or XML data
that is compatible with the particular information appliance
15. The information appliances 15 may be any type of device
including WAP compliant cell phones, Windows CE
devices, Palm OS devices, and any other HTML browser
based devices.

For each wireless device 15, there may be a separate
telecom system 14, such as different gateways or proxies to
the different information appliances 15. Thus, in accordance
with the invention, the content provider 13 may create a
single piece of content in a single format and the telecom
system 14 does not have to be modified in order to provide
the wireless content in accordance with the invention to each
separate wireless device 15. In addition, each Internet
content provider 13 also has the ability to customize the
appearance of their web site pages using the system's web
based GUI tool. The GUI tool allows a web page producer
or developer to create a set of rules that are used by the
translation server 12 to describe how a particular web page
is to be translated by the system 10. In particular, the "drag
and drop" functionality of the GUI tool is a toolkit that
allows a "producer" level employee to tailor the "look and
feel" of the site so that, when the web site is delivered to a
wireless device 15 via the system 10, the same "look and
feel" can be delivered as well. Now, more details of the
wireless content delivery system will be described.

FIG. 3 is a functional diagram of the content delivery
system and in particular the translation server 12 in
accordance with the invention. In the example shown, the content
20 may be an HTML web page. The translation server 12
may include an intelligent harvester 22, a tree synthesizer
26, a tree analyzer 30, a card builder 32 and a deck builder
36 that generate a presentation shoe 38 that may be sent to
a particular information appliance 15. Intelligently
harvesting an HTML web page involves not only grabbing the
content on the site (scraping), but also enabling any
functionality on that site on the target information
appliance or device. This enabled functionality may include,
for example, forms, transactions, javascript, cookies, session
data and security measures. This enabled functionality is
possible due to the virtual browser (See FIG. 7) that
provides, for example, javascript and cookie proxy engines,
so that an information appliance that cannot support
javascript may do so with the javascript being executed on the
translation server. As another example, an information
appliance may not support a persistent session that may be
required by the web site, but the virtual browser may enable a
persistent session so that the web site thinks that it is
interacting with an information appliance that has the
persistent session capability. This process also involves
applying a set of rules describing the relational context of the
content (e.g., how pieces of, for example, HTML code
relate to each other).

In more detail, the intelligent harvester 22 may receive the
content and generate a relational data structure 24 that
corresponds to the content as described below in more detail.
The data structure containing the content in a relational
format in accordance with a preferred embodiment of the
invention is a proprietary relational markup language known
as RML. RML is an XML based language which has the
advantage of permitting the easy mapping of the content into
a tree structure by the tree synthesizer 26 so that the tree
synthesizer may output a typical document object model
(DOM) 28. The DOM is a common object model used to
manipulate markup such as HTML, as disclosed on
the W3C web site at http://www.wc3.org. Although it is
typically used for manipulating HTML or XML, it also
provides the tree structure needed by the layout engine as
described below during the tree analysis functions.
Generally, a tree data structure is a method for representing
a hierarchy of data using tree diagrams formed from nodes
and line segments between the nodes. This may be a bit
confusing because the DOM's tree structure may be used
both for its intended purpose of storing the HTML markup
contained in atomics and as a way of storing relational
information about those atomics as described below in more
detail.

The tree analysis function 30 may receive the DOM and
dynamically generate pages for a variety of different
target screen sizes. In more detail, a well known
search algorithm, such as that described in
the Dictionary of Computer Science, Engineering and
Technology, is used to recurse through the atomics and
groups with complicated rules as described below for
optimally filling a screen with content while creating an
intelligent navigation scheme in accordance with the invention.
The Card Builder function 32 may apply the protocol specific
syntax needed by the particular target device and output a
presentation card 34. In accordance with the WAP protocol,
the Deck Builder 36 is needed to put the cards into decks and
into a presentation shoe 38 to optimize transmission latency
through wireless networks. The system also permits the
intelligent interaction of the information appliance with the
web site since the system may enable functionality not
typically supported by the information appliance via the
virtual browser. The functionality of the content delivery
system will be described in more detail below. Now, a
preferred hardware implementation of the content delivery
system in accordance with the invention will be described.

FIG. 4 is a diagram illustrating a preferred implementation
of the translation server 12 of the content delivery system 10
in accordance with the invention. In particular, the server 12
may include a content connection handler 40, a layout
engine 42, an appliance connection handler 44 and an XML
engine 46. The translation server 12 may also include a
database 47 that may contain XSL rules used by the XML
engine 46 for converting XHTML pages into RML, one or
more URL IDs and various device information. In
accordance with the invention, each XSL rule may be indexed in
the database based on an ID (the ID may contain a URL, a
name/value pair and cookie information) so that the system
may determine which rule applies to which incoming URL.
The device information may be used by the layout engine 42
in order to convert the RML data into one or more cards in
a deck that may be displayed on the particular device. For
example, the information may indicate the amount of
information that may be fit on each screen. The translation server
12 may also include a long term database 48 that may
contain cookies so that the system knows which pages have
been processed previously and a session database 50 that
stores a presentation shoe for each device.

Web page requests made by an information appliance 15
to an Internet content provider 13 may occur using, for
example, the Wireless Application Protocol (WAP), a
wireless data transmission standard described in the WAP 1.2
Specification Suite located at http://www.wapforum.org,
incorporated herein by reference. Generally, in accordance
with the WAP protocol, an information appliance 15 may
initiate a request in WML (Wireless Markup Language), a
language derived from XML especially for wireless network
characteristics. This request is passed to a WAP gateway that
then retrieves the information from the Internet. The
requested information is then sent from the WAP gateway to
the WAP client (the device 15), using whatever mobile
network bearer service is available and most appropriate.

The data transmitted over the telecommunications network
14 (See FIG. 2) is represented as a group of data
packets and electronically routed over the telecommunica- 
tions network 14 to an appropriate destination node, such as 
an appropriate information appliance 15. The packets of data 
include header information that describes the information 
contained in the data packets, such as the type of data 
contained in the packet, i.e. HTML, voice, ASCII, etc., and
origination and destination node information, which is used by
the system 10 to route and configure the data packets for 
transmission through the wireless network 14, according to 
well known network routing techniques. 

In accordance with the invention, when an information 
appliance 15 requests a web page from the content provider 
13, the content provider 13 examines the header information 
contained in the data packets and redirects any non-PC 
requests to the translation server 12. In accordance with the 
invention, content from an Internet content provider 13 is 
received by the content connection handler 40. The content 
connection handler 40 mimics a standard HTML browser, 
such as Internet Explorer or Netscape, and functions as the 
interface with a content provider's web site. Non-PC 
requests are redirected to the translation server 12 so that the 
web page information can be translated into a data format 
appropriate for and recognizable by the destination infor- 
mation appliance 15. Desktop PC requests do not need to be 
redirected to the translation server 12 since these devices are 
able to display web pages in standard HTML or other similar 
formats and do not require customized web page informa- 
tion. 
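
By way of illustration only, the redirection decision described above may be sketched as follows. The sketch assumes that a non-PC request can be recognized from the MIME types listed in its Accept header; the marker list, the function names and the "url" query parameter are hypothetical and are not taken from this description.

```python
from urllib.parse import urlencode

# Assumed markers of a wireless (non-PC) browser; illustrative only.
WIRELESS_ACCEPT_MARKERS = ("text/vnd.wap.wml", "text/x-hdml")


def needs_translation(headers):
    """Return True when the request headers suggest a non-PC device."""
    accept = headers.get("Accept", "").lower()
    return any(marker in accept for marker in WIRELESS_ACCEPT_MARKERS)


def route(headers, translation_server_url, requested_url):
    """Return the URL that should serve this request."""
    if needs_translation(headers):
        # Non-PC request: redirect to the translation server and pass
        # the originally requested URL as a (hypothetical) parameter.
        return translation_server_url + "?" + urlencode({"url": requested_url})
    # Desktop PC request: serve the page directly, no translation needed.
    return requested_url
```

A desktop request whose Accept header lists only text/html is returned unchanged, whereas a request listing text/vnd.wap.wml is pointed at the translation server.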

The appliance connection handler 44 operates as a Web 
server for a requesting information appliance 15. The appli- 
ance connection handler 44 brokers and controls the entire 
transaction between the requesting device 15 and the trans- 
lation server 12. For example, the appliance connection 
handler 44 may handle the session with each telecom system 
14 or wireless device 15 and may perform various functions 
such as establish a session with the information appliance or 
wireless device 15, retrieve page information from the 
content connection handler 40, translate received pages 
using the XML engine 46 and the layout engine 42 as 
described below, and transmit the translated page informa- 
tion to a requesting information device 15 with appropriate 
header information. The appliance connection handler 44 
may also determine state information of the devices 15, 
synchronize the devices 15, determine browser and protocol 
information and perform various security operations. 

The XML engine 46 converts the XHTML page, gener- 
ated by the content connection handler 40, to a proprietary 
markup language, RML (the Relational Markup Language),
using a rule-set that may be stored in a database 47. RML is 
a markup language written in XML. Each page-type may 
include 1) a XSL rule-set that specifies what pieces of 
content to display on various appliances as well as the relational
structure amongst those pieces of content; and 2) a perl 
script. XSL is a language for transforming an XML docu- 
ment into another XML document as described at the W3C 
website located at http://www.w3c.org. The XSL rule-sets
used by the XML engine 46 may be stored in a database 47 
indexed by ID page-type identifiers. The XSL rule-set may 
be used to process hierarchical elements of the content while 
the perl script may be used to process unstructured markup 
in the content. In accordance with the invention, an
unstructured tag may be used to mark the markup that may be
processed using the perl script. Each page-type has a rule-set 
and possibly a perl script associated with it. In accordance 
with the invention, the XSL rule-set and the perl script may 
be generated manually or may be automatically generated.
The information stored in the database 47 can be used by the 
XML engine 46 when converting page information. 
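
As a minimal sketch of how a rule-set might be selected for an incoming page, the following assumes that each stored ID is reduced to a host, a path and an optional name/value pair. The table layout, the rule-set file names and the function name are hypothetical; the actual schema of the database 47 is not specified here.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical rule table: (host, path, required name/value pair, rule-set).
# More specific entries are listed first so that they win the match.
RULES = [
    ("www.example.com", "/quotes", ("view", "detail"), "quote_detail.xsl"),
    ("www.example.com", "/quotes", None, "quote_list.xsl"),
]


def find_rule_set(url, rules=RULES):
    """Return the rule-set name for the page type of `url`, or None."""
    parts = urlsplit(url)
    params = set(parse_qsl(parts.query))
    for host, path, pair, rule_set in rules:
        if parts.netloc == host and parts.path == path:
            # An entry without a name/value pair matches any query string.
            if pair is None or pair in params:
                return rule_set
    return None
```

Under these assumptions two requests to the same path can resolve to different page types, and therefore different rule-sets, depending on a name/value pair in the query string.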

The layout engine 42 processes the content to convert the 
relational RML content, received from the XML engine 46, 
into device and protocol specific mark-up language formats. 
Formatted output provided by the layout engine 42 will be 
referred to herein as a "presentation shoe." For purposes of 
this description, a presentation shoe consists of "decks" and 
"cards." A deck is the smallest unit of content that is sent to 
a device. In certain protocols, such as WML and HDML, a 
deck can contain multiple cards. The presentation shoe thus 
contains the original HTML content of a webpage that is 
reformatted into an appropriate format language and tar- 
geted at an information appliance 15. The presentation shoe 
is formatted specifically for the target appliance's screen 
size, user interface and protocol. Advantageously, the pre- 
sentation shoe is created dynamically in accordance with 
relational information in the RML data, which will be 
described in detail herein. Information about the presenta- 
tion shoe can be stored in a session database 50 that may be 
in communication with the layout engine 42. Now, the 
details of the operation of the translation server will be 
described. 
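
For illustration, the shoe, deck and card relationship described above may be modeled as nested containers. The class names, the field names and the cards_per_deck parameter are assumptions made for this sketch rather than the data structures of the layout engine 42.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Card:
    """One screen of content for the target device."""
    title: str
    body: str


@dataclass
class Deck:
    """Smallest unit sent to a device; may hold several cards."""
    cards: List[Card] = field(default_factory=list)


@dataclass
class PresentationShoe:
    """All decks generated for one device from one web page."""
    device_id: str
    decks: List[Deck] = field(default_factory=list)


def build_shoe(device_id, cards, cards_per_deck):
    """Group cards into decks of at most `cards_per_deck` cards each."""
    shoe = PresentationShoe(device_id)
    for start in range(0, len(cards), cards_per_deck):
        shoe.decks.append(Deck(cards[start:start + cards_per_deck]))
    return shoe
```

For a protocol whose decks cannot contain multiple cards, cards_per_deck would simply be set to 1 in this sketch.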

FIG. 5 is a block diagram illustrating operation of the 
wireless content delivery system 10 of the invention. As 
described above and shown in FIG. 5, an information 
appliance 15 may request page information of a particular 
URL website from a content provider 13. The request by the 
information appliance 15 may be redirected, for example, 
via a JAVA servlet 60, to the translation server 12. The 
appliance connection handler 44 examines header informa- 
tion from the requesting data in order to determine a target 
device 15, protocol and browser configuration. The appli- 
ance connection handler 44 then requests the desired URL 
information from a content connection handler 40. The 
appliance connection handler 44 may relay information 
about the requesting wireless device 15 to the content 
connection handler 40 such as a requested URL address of 
a content provider 13, a session ID, a user ID and a wireless 
device's browser capabilities. The content connection han- 
dler 40 retrieves the requested information from the content 
provider 13, renders any JavaScript, stores any cookies and 
returns the requested information as XHTML data to the 
appliance connection handler 44. 

The appliance connection handler 44 then requests the 
XML engine 46 to convert the received XHTML data to 
RML data and assign atomics into the relational structure 
according to page type so that presentation cards can be 
created and placed in a presentation shoe so that the cards 
can be transmitted to the target device 15. The XML engine 
46 will be described in more detail below with reference to 
FIGS. 7-9. 

The appliance connection handler 44 then conveys the
RML data output from the XML engine and device infor- 
mation (stored in the database 47 shown in FIG. 4) to the 
layout engine 42 so that the layout engine 42 can generate 
a device and protocol specific set of cards that are served to 
the requesting appliance 15 by the appliance connection 
handler 44, via the presentation shoe. 

In more detail, the layout engine 42 may include a layout 
processor 62, a preprocessor 64, a recursenode module 66, 
a card formatter 68, a deckbuilder 70, a contentcutter 72, an
ordernodes module 74, a guidehandler 76, recursenode and
recurseatomic modules 78, 80, a navigation builder 82 and
a shoebuilder 84. The layout processor 62 may receive the 
XML DOM, the device information from the database 47
and the shoe ID and may forward the shoe information to a
preprocessor 64. The preprocessor may forward the XML 
DOM and class information to a content cutter 72 that 
operates on the RML data to remove any classes of infor- 
mation that cannot be displayed on the particular informa- 
tion appliance or wireless device 15 based on the device 
information from the database 47. Once the classes are 
removed, an ordered structure of atomics is generated from 
the RML data by the OrderNodes module 74 so that, for 
example, a tree structure describing the relation of different 
portions of the content information to be displayed on the 
device 15, can be created. A GuideHandler module 76 
permits the system to put a template on each card or the first
card in a shoe delivered to an information appliance. For
example, the logo of a company may be displayed in the 
corner of each page displayed to the information appliance.
The navigation builder 82 may permit links between cards 
and other items to be placed on a card. For example, a 
particular web page may always have one or more naviga- 
tion links on each page (e.g., support, products, etc.) that 
may be placed on each card by the navigation builder. 
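
The content cutting step may be sketched as follows, purely as an example. The RML element names ("group", "atomic") and the "class" attribute are assumptions, since the RML schema is not set out in this passage; only the Python standard library XML parser is used.

```python
import xml.etree.ElementTree as ET

# Hypothetical RML fragment; element and attribute names are assumed.
SAMPLE_RML = """
<group name="home">
  <atomic class="image">logo.gif</atomic>
  <atomic class="text">Welcome</atomic>
  <group name="news">
    <atomic class="applet">ticker</atomic>
    <atomic class="text">Headline</atomic>
  </group>
</group>
"""


def cut_content(root, unsupported_classes):
    """Remove every element whose class the target device cannot display."""
    # Snapshot the parents first so removals do not disturb the iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.get("class") in unsupported_classes:
                parent.remove(child)
    return root


def remaining_text(root):
    """List the text of the atomics left after cutting, in document order."""
    return [atomic.text for atomic in root.iter("atomic")]
```

In this sketch the set of unsupported classes would come from the device information, for example {"image", "applet"} for a text-only telephone display.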

The RecurseNode modules 66 and 78 may then operate on
the tree structure to group related content information so that 
appropriate cards can be created for presentation on the 
wireless device 15. The recursion of the nodes in the tree is 
described in more detail below with reference to FIG. 14. In 
processing the tree structure, the card formatter module 68 
may generate presentation cards that include the appropri- 
ately grouped atomics as determined by the recursenodes 
and recurseatomic modules 66, 78, 80. The deck builder 
module 70 may then group the plurality of presentation
cards into decks so that the decks can be organized into an
appropriate presentation shoe to optimize the wireless link 
between the system and the information appliance. To 
perform the above function, the deckbuilder 70 may include 
the navigation builder 82 that builds the navigation path 
between the cards in the deck and the shoebuilder 84 that 
actually builds the presentation shoe. 
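
As a simplified, non-limiting sketch of the ordering and card-filling steps just described, the following Python fragment sorts a tree by priority and then walks it depth-first, starting a new card whenever the next atomic would not fit. The Node class, the size field (a display cost in lines), the rule that a lower number means a higher priority, and the card_capacity parameter are all assumptions made for illustration; panes, navigation links and pane splitting are omitted.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str
    priority: int = 0      # assumption: lower number = higher priority
    size: int = 0          # hypothetical display cost of an atomic, in lines
    children: List["Node"] = field(default_factory=list)

    @property
    def is_atomic(self):
        return not self.children


def order_nodes(node):
    """Sort children by priority so high-priority content is visited first."""
    node.children.sort(key=lambda c: c.priority)
    for child in node.children:
        order_nodes(child)


def build_cards(root, card_capacity):
    """Depth-first walk that fills cards with atomics up to card_capacity lines."""
    cards = [[]]
    used = 0

    def recurse(node):
        nonlocal used
        if node.is_atomic:
            if used + node.size > card_capacity and cards[-1]:
                cards.append([])   # current card is full: start a new one
                used = 0
            cards[-1].append(node.name)
            used += node.size
        else:
            for child in node.children:
                recurse(child)

    order_nodes(root)
    recurse(root)
    return cards


root = Node("root", children=[
    Node("news", priority=2, children=[Node("story1", size=2), Node("story2", size=2)]),
    Node("quotes", priority=1, children=[
        Node("title", size=1), Node("entry", size=1), Node("go", size=1)]),
])
cards = build_cards(root, card_capacity=4)
print(cards)
```

With a capacity of four lines, the three atomics of the higher-priority group share the first card and the two stories are placed on a second card.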

Once the presentation shoe is completed by the layout 
engine 42, it returns the presentation shoe to the appliance 
connection handler 44. The appliance connection handler 44
then may relay the presentation shoe, for example via the
JAVA servlet 60, to the appropriate requesting wireless
device 15 so that the content information from the content 
provider 13 can be displayed in a format appropriate for the 
wireless device 15. 

FIG. 6 is a diagrammatic view illustrating the function of 
the content connection handler 40. The content connection 
handler 40 establishes an on-line session with a web site in 
order to receive webpage content to be converted. In a 
preferred embodiment, HTML pages are retrieved from a 
specified URL website and the retrieved HTML information 
is formatted as XHTML, an XML compliant HTML format 
utilized by the translation server 12 and represented by a 
document object model (DOM). A DOM is a common object 
model used to manipulate markup. This structure is used to 
translate the HTML page information to a format that can be 
translated by the XML engine 46 to be described in detail 
below. The content connection handler 40 may communicate 
with a long-term database 48 (See FIG. 4) that is configured 
to store cookie information about the web sites from which 
page information is retrieved. 

As described above, the content connection handler 40
functions as the interface with a content provider's website.
In a preferred embodiment, it acts as a virtual browser,
proxying browser functionality on behalf of the information
appliance. This function is performed by an HTTP client
connection module 90. The HTTP client connection module
90 functions to maintain a communication session with the
content provider 13 and handle cookie and security (SSL)
information. An HTML data processor 92 may render
JavaScript data in a proxy fashion for the requesting information
appliance 15. With a scripting language, such as JavaScript,
the client-side source code is embedded directly into the
HTML page and a client-side software plug-in that interprets
that language is automatically activated while the HTML
page is being displayed. This client-side software is not
available on most information appliances, requiring our
servers to proxy this function. An HTML cleanup module 94
may generate XHTML page information so that the
translation server 12 can translate the received HTML page
information to a format readable by and appropriate for an
information appliance 15. Now, a method for intelligent
harvesting in accordance with the invention will be
described that includes some of the elements described
above.
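
One way the cookie handling of the HTTP client connection module 90 could be approximated is sketched below in Python, using only the standard library modules urllib.request and http.cookiejar. The class name VirtualBrowser and the appliance identifiers are hypothetical, and the sketch assumes that the server keeps one cookie jar per requesting appliance so that sessions with a content provider are not mixed.

```python
import urllib.request
from http.cookiejar import CookieJar


class VirtualBrowser:
    """Keeps one cookie jar per appliance so sessions do not mix."""

    def __init__(self):
        self._jars = {}

    def jar_for(self, appliance_id):
        # Create the jar on first use, then always return the same object.
        return self._jars.setdefault(appliance_id, CookieJar())

    def opener_for(self, appliance_id):
        # HTTPCookieProcessor stores and replays cookies on the appliance's behalf.
        handler = urllib.request.HTTPCookieProcessor(self.jar_for(appliance_id))
        return urllib.request.build_opener(handler)


browser = VirtualBrowser()
phone_opener = browser.opener_for("phone-1")   # hypothetical appliance id
same_jar = browser.jar_for("phone-1") is browser.jar_for("phone-1")
separate_jars = browser.jar_for("phone-1") is not browser.jar_for("pda-7")
```

A call such as phone_opener.open(url) would then fetch a page while sending and recording cookies for that appliance only; no request is made in the sketch itself.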

FIG. 7 is a diagram illustrating a method for intelligent
harvesting of web pages in accordance with the invention
that uses the elements shown in FIGS. 3 and 4. In particular,
the content 20 is fed into a virtual browser 100 as described
briefly above which passes on the content to the XML
engine 46. Using the particular page-type rule sets stored in
the database 47, the XML engine 46 may convert the content
into relational markup language (RML) format 24. In more
detail, a session on the Internet is handled by the virtual
browser 100 located on the translation server 12. This virtual
browser provides the important functionality of proxying
JavaScript (using a JavaScript proxy engine 102) and cookies
(using a cookie proxy engine 104) for the target devices. For
each page-type, an XSL rule-set is stored (See the W3C
specification at http://www.w3.org/TR/2000/WD-xsl-20000112)
in the database 47. This rule-set contains information
about the division of content into atomics, content
classification, and the relationships amongst these atomics
for the particular page. The XML Engine 46 uses these XSL
rule-sets to translate the HTML into RML. The division of
content into atomics can be generated either automatically or
by a producer using a tool set. The content classification
information is used to determine what content and functionality
are appropriate for which classes of devices. The
content classification information may be generated either
manually or automatically. Now, more details of the XML
engine in accordance with the invention will be described.

FIG. 8 is a diagrammatic view illustrating more details of
the XML engine 46. As described above, the XML engine 46
extracts content from dynamically changing XHTML information
and generates a corresponding file, for example, an
RML file, in accordance with predetermined rulesets. XSL
rulesets define the transformation algorithms used to convert
between formats, such as between XHTML and RML. An
example of an XHTML document being converted into
RML using an XSL ruleset is described below.

In operation, the XML engine 46 may receive a page-type
designated by URL, name/value pairs, and cookie information
and pages of XHTML information from the content
connection handler 40. A URL/Rule hashtable module 112
may receive certain XSL rulesets from the database 47 that
define how the information from the URL website is to be
converted. Different XSL rulesets may be used depending on
the format of the particular URL website from which original
HTML page information has been received. In conjunction
with the rulesets determined by the hashtable 112, an
XSL transform processor 110 may convert the received
XHTML information to RML information that is provided
to the layout engine 42 so that it can be converted into device
and protocol specific mark-up language formats.
Additionally, the XSL rulesets may permit the Internet
content provider 13 to control the look and feel of the
content (e.g., the location and/or inclusion of certain elements
of the content together on a single card) regardless of
the wireless device 15 on which the content is being displayed.
Now, an example of an HTML page prior to
conversion, an XSL ruleset that may be used to convert the
particular HTML page and the resulting RML page in
accordance with the invention will be described.
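
The lookup performed by the URL/Rule hashtable module 112 could, under stated assumptions, be sketched as a table keyed by host and path prefix. In the Python fragment below the table contents, the ruleset file names and the longest-prefix matching rule are all hypothetical; the description above says only that different rulesets may be selected depending on the website from which the page was received.

```python
from urllib.parse import urlsplit

# Hypothetical rule table: (host, path prefix) -> name of an XSL ruleset.
RULES = {
    ("www.foo.com", "/products"): "foo_products.xsl",
    ("www.foo.com", "/"): "foo_default.xsl",
}


def find_ruleset(url, rules=RULES):
    """Return the ruleset whose path prefix is the longest match for the URL."""
    parts = urlsplit(url)
    path = parts.path or "/"
    best = None
    for (host, prefix), ruleset in rules.items():
        if host == parts.hostname and path.startswith(prefix):
            if best is None or len(prefix) > len(best[0]):
                best = (prefix, ruleset)
    return best[1] if best else None


product_rules = find_ruleset("http://www.foo.com/products/fighter?id=7")
home_rules = find_ruleset("http://www.foo.com/")
unknown_rules = find_ruleset("http://www.bar.com/")
```

A page with no matching entry yields None in this sketch; how the system treats such a page is not specified here.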

Below is an example of an HTML web page that may be 
converted in accordance with the invention into RML. 



SAMPLE HTML CODE FOR WEB PAGE

<html>
<head>
  <title>Welcome to Foo.com!</title>
</head>
<body>
  <p><a href="http://www.foo.com/login/">Log In</a></p>
  <p><a href="http://www.foo.com/signup/">Sign Up</a></p>
  <h1 align="center">Foo.com</h1>
  <h4 align="center">Your source for all things foo!</h4>
  <table>
    <tr>
      <th colspan="2" align="center">Foo Products</th>
    </tr>
    <tr>
      <td>Foo Fighter</td>
      <td>$19.95</td>
    </tr>
    <tr>
      <td>Foo Peacemaker</td>
      <td>$29.95</td>
    </tr>
    <tr>
      <td colspan="2">
        <a href="http://www.foo.com/buy">Buy these wonderful Foo's!</a>
      </td>
    </tr>
  </table>
  <p>(c) 2000 Foo.com</p>
</body>
</html>

Now, an XSL ruleset that may be used to convert the above
HTML page into RML in accordance with the invention is
set forth below.



<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="*|/"><xsl:apply-templates/></xsl:template>
<xsl:template match="text()|@*"><xsl:value-of select="."/></xsl:template>
<xsl:template match="html">
  <rml>
    <head>
      <title><xsl:value-of select="//title"/></title>
    </head>
    <guide>
      <navigation>
        <pane>
          <xsl:for-each select="body/p/a">
            <atomic name="Goto:" class="1" column="column">
              <a href="{@href}"><xsl:value-of select="normalize-space(.)"/></a>
            </atomic>
          </xsl:for-each>
        </pane>
      </navigation>
    </guide>
    <group name="Main" class="1" sequential="sequential">
      <xsl:apply-templates select="body"/>
    </group>
  </rml>
</xsl:template>
<xsl:template match="body">
  <atomic name="Title" class="1">
    <b><xsl:value-of select="h4"/></b>
  </atomic>
  <xsl:apply-templates select="table"/>
</xsl:template>
<xsl:template match="table">
  <group name="Toy Table" class="1">
    <atomic name="Table Title" class="1" sequential="sequential">
      <b><xsl:value-of select="tr/th"/></b>
    </atomic>
    <xsl:for-each select="tr[td[not(@colspan)]]">
      <group name="Toy" class="1">
        <xsl:for-each select="td">
          <atomic name="Entry">
            <xsl:value-of select="."/>
          </atomic>
        </xsl:for-each>
      </group>
    </xsl:for-each>
  </group>
</xsl:template>
</xsl:stylesheet>



The resulting RML code in accordance with the invention
is set forth below:



<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<rml>
  <head>
    <title>Welcome to Foo.com!</title>
  </head>
  <guide>
    <navigation>
      <pane>
        <atomic name="Goto:" class="1" column="column">
          <a href="http://www.foo.com/login/">Log In</a>
        </atomic>
        <atomic name="Goto:" class="1" column="column">
          <a href="http://www.foo.com/signup/">Sign Up</a>
        </atomic>
      </pane>
    </navigation>
  </guide>
  <group name="Main" class="1" sequential="sequential">
    <atomic name="Title" class="1">
      <b>Your source for all things foo!</b>
    </atomic>
    <group name="Toy Table" class="1">
      <atomic name="Table Title" class="1" sequential="sequential">
        <b>Foo Products</b>
      </atomic>
      <group name="Toy" class="1">
        <atomic name="Entry">Foo Fighter</atomic>
        <atomic name="Entry">$19.95</atomic>
      </group>
      <group name="Toy" class="1">
        <atomic name="Entry">Foo Peacemaker</atomic>
        <atomic name="Entry">$29.95</atomic>
      </group>
    </group>
  </group>
</rml>
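
For readers who wish to trace the mapping step by step, the following Python sketch performs the same selections as the XSL ruleset on a well-formed copy of the sample page. It is not the XSL transform processor 110: it swaps in Python's standard xml.etree.ElementTree module in place of an XSL processor, and the function name to_rml is hypothetical.

```python
import xml.etree.ElementTree as ET

XHTML = """<html><head><title>Welcome to Foo.com!</title></head><body>
<p><a href="http://www.foo.com/login/">Log In</a></p>
<p><a href="http://www.foo.com/signup/">Sign Up</a></p>
<h1 align="center">Foo.com</h1>
<h4 align="center">Your source for all things foo!</h4>
<table>
<tr><th colspan="2" align="center">Foo Products</th></tr>
<tr><td>Foo Fighter</td><td>$19.95</td></tr>
<tr><td>Foo Peacemaker</td><td>$29.95</td></tr>
<tr><td colspan="2"><a href="http://www.foo.com/buy">Buy these wonderful Foo's!</a></td></tr>
</table>
<p>(c) 2000 Foo.com</p>
</body></html>"""


def to_rml(xhtml_text):
    html = ET.fromstring(xhtml_text)
    rml = ET.Element("rml")
    head = ET.SubElement(rml, "head")
    ET.SubElement(head, "title").text = html.findtext("head/title")

    # Navigation pane: one atomic per top-level link (body/p/a).
    guide = ET.SubElement(rml, "guide")
    pane = ET.SubElement(ET.SubElement(guide, "navigation"), "pane")
    for link in html.findall("body/p/a"):
        atomic = ET.SubElement(
            pane, "atomic", {"name": "Goto:", "class": "1", "column": "column"})
        a = ET.SubElement(atomic, "a", href=link.get("href"))
        a.text = " ".join(link.text.split())   # same effect as normalize-space(.)

    main = ET.SubElement(
        rml, "group", {"name": "Main", "class": "1", "sequential": "sequential"})
    title = ET.SubElement(main, "atomic", {"name": "Title", "class": "1"})
    ET.SubElement(title, "b").text = html.findtext("body/h4")

    table = ET.SubElement(main, "group", {"name": "Toy Table", "class": "1"})
    heading = ET.SubElement(
        table, "atomic",
        {"name": "Table Title", "class": "1", "sequential": "sequential"})
    ET.SubElement(heading, "b").text = html.findtext("body/table/tr/th")
    for row in html.findall("body/table/tr"):
        cells = row.findall("td")
        # Keep only rows with a td lacking colspan, as in tr[td[not(@colspan)]].
        if any(td.get("colspan") is None for td in cells):
            toy = ET.SubElement(table, "group", {"name": "Toy", "class": "1"})
            for td in cells:
                # td.text suffices here because the kept cells hold plain text.
                ET.SubElement(toy, "atomic", name="Entry").text = td.text
    return rml


rml = to_rml(XHTML)
entries = [a.text for a in rml.iter("atomic") if a.get("name") == "Entry"]
links = [a.get("href") for a in rml.iter("a")]
```

The header row and the "Buy" row are skipped for the same reason as in the ruleset: neither contains a table cell without a colspan attribute.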



The above RML code may also include the unstructured
tag as described above that surrounds a particular portion
of the mark up language that is unstructured and may be
processed using a perl script. Now, a graphical example of
HTML content being converted into the RML in accordance
with the invention will be described.

FIG. 9 is a diagram illustrating an HTML page 120 being
converted into an RML page 122 in accordance with the
invention. For purposes of the conversion, each piece of
discrete content with presentation structure or no structure
124 is mapped into an atomic 126 in the RML code. It should
be noted that the hyperlinks within a piece of content are
maintained in the atomics. In addition to the atomics, groups
of content 128 in the HTML page may be converted into a
group 130 in RML. In this example, the group is non-sequential
(e.g., the content in the group does not need to be
displayed sequentially). Now, an example of the division of
a graphical web page into atomics and groups in accordance
with the invention and the layout engine 42 will be described
in more detail.

FIG. 10 is a diagram of the layout engine 42 in accordance
with the invention. As described above, the layout engine 42
formats a content source for a specific device's screen and
inherent capabilities. The layout engine 42 may include the
content cutter 72, the layout processor 62, the card builder
68, and the deck builder 70. The content cutter 72 cuts all the
content of format and content classes not appropriate for the
specific device from the received HTML page to create an
XML representation of the received original webpage. The
layout processor 62, using prior knowledge of the device
type and the content, dynamically devises an optimal layout
and navigation structure for the particular device 15. Thus,
the output of the layout engine 42 is a presentation shoe in
the appropriate presentation protocol for a particular appliance 15.

Presentation shoes are built from the bottom-up, resulting
in the highest priority "atomic" being placed on a card first.
An atomic is the smallest unit of a web page that encapsulates
an idea. For example, an atomic may be a paragraph of
text, a heading, a link to a news story, a picture, etc. Atomics
may be grouped together to reveal relationships between
them. Groups may be nested to form a complex relational
hierarchy. These groups can be placed on cards so that
customized presentation pages can be transmitted to a device
15.

In operation, an RML document is received by the layout
engine 42 from which the content cutter 72 cuts data classes
that are not appropriate for the requesting device 15 to
generate an XML document containing the cut content,
device information that specifies the target device, and
protocol information that specifies the target protocol.

Card creation will be described with reference to FIGS.
11-13. An example of a portion of an HTML web page 170
from the E-TRADE website is shown in FIG. 11. In the
Figure, the innermost dashed boxes designate atomics while
the boxes enclosing them constitute groups. At the top
portion of the page 170 is a quote look-up form 171. The
quote look-up form 171 is made up of three atomics, a
"Quotes" title portion 171a, an entry box 171b and a "Go"
submission button 171c. Further, the market graph 172a, the
table 172b, and Fool.com advertisement 172c are each
related atomics and are grouped together to constitute group
172. In addition, each element in the market graph 172a may
also be an atomic so that "NASDAQ" is an atomic,
"2756.27" is an atomic, the down arrow is an atomic and
"-5.48" is an atomic. Finally, the TheStreet.com logo 173a,
and the four news stories 173b-e are each related atomics
and are grouped together as group 173. All of the groups
171, 172, 173 make up the root group 170. These groups
constitute the relational hierarchy for this portion of the
E-TRADE website.

To display this website on a display device 15, such as a
Palm Pilot or a Windows CE device, the groups and atomics
need to be organized and placed on cards that make up the
presentation shoe. Cards are created by examining how
groups [...] fit on the cards. A tree data structure can be
generated from the RML object. As described above, nesting
groups describe the relational context of content contained
in a webpage. In RML formatting, each group typically has
three primary attributes: name, class and priority. The name
attribute is used during a navigation pass through the tree as
a link description to the group if needed. The class attribute
is used by the content cutter 72 to determine the content in
the webpage that is appropriate for a specific device's
capabilities. Thus, the class attribute allows different levels
of content to be presented to different classes of devices. For
example, the general classes of devices are shown in the
following table, but the number of classes may be increased
or decreased.

Class   General Device class
1       Cellular telephones - very limited screen size
2       Palm Pilot - low resolution, black and white or monochrome display
3       Windows CE - high resolution, color display

The priority attribute indicates the importance of each
portion of the content and is used by the layout generator 42
during preprocessing to order the content into appropriate
groups. To further organize the groups, each group also has
Boolean attributes, such as columnize, sequential and other
attributes such as association. The association attribute may
have the values of keep together (keep the atomics together),
isolate (isolate the atomic on its own card) or null. The
columnize attribute indicates whether an atomic/group
should be placed adjacent another atomic/group on a card.
The sequential attribute specifies how the navigation structure
should be designed. For example, atomics containing
paragraphs of a story would be grouped as sequential
information. In contrast, a list of links to news stories would
be grouped as non-sequential information. The keep
together attribute indicates to the layout engine 42 to keep
children of a node together on the same card if possible. The
isolate attribute specifies that the group is to be placed onto its
own card.

In addition to the above, knowledge of the number of
characters and pixels horizontally and vertically on the
target wireless device 15 is desirable. In addition, for devices
15 that allow fonts, a font width calculation may need to be
made. A bandwidth and screen dependent variable may be
assigned to determine how much content is allowed to scroll
before a new presentation page is created vertically and a
link created to subsequent presentation pages. Another variable
may store the minimum container width (how narrow
the text container can be made so that another container can
be put along side it).

FIG. 12 illustrates the relational structure of the groups in
the E-TRADE web page shown in FIG. 11 that may be
represented in a tree structure. This separation of content
from style in a relational way that can be represented in a
tree allows for the re-formatting of the content in accordance
with the invention. In the Figure, a root element node 180
refers to the outermost group of the tree and indicates the
E-TRADE web page 170. The nodes 182-186 (roman
numerals I, II, and III) are the "children" of the root node and
refer to the atomics 171a, 171b, 171c of the Quotes portion
171 of the E-TRADE web page 170. A node 188 (roman
numeral IV), another child of the root node, refers to the
market group 172 and includes children nodes A-C (market
graph 172a, table 172b and Fool.com advertisement 172c of
FIG. 11). Similarly, a node 190 (roman numeral V) refers to
the news section 173 and includes children nodes A-E (the
news story links 173a-e of FIG. 11). The flexibility of trees
and directed graphs allows them to represent any structure
found in a web site. Thus, the trees provide a method to
organize the complicated relationships amongst content in a
web page for formatting as well as provide navigation links
between cards in a presentation shoe.

To provide a simple tree structure that can be efficiently
navigated, the tree is sorted by priority. This pre-processing
reorders the tree to put the content that will appear on the
device 15 first in the left-most nodes of the tree. After the
tree has been sorted, the groups and content are evaluated in
order to assign the content to one or more cards using the
attributes described above. A depth-first search algorithm as
described above, using a recursion technique that considers
outgoing edges of a vertex before any neighbors of the
vertex, may be used to implement this evaluation as
described below with reference to FIG. 13. As this search
evaluates each vertex, it attempts to optimally fit the atomics
onto cards.

FIG. 13 is a flowchart illustrating a method 200 for
recursing through the tree generated by the XML engine in
order to generate customized panes and cards that may be
displayed on an information appliance or wireless device 15
in accordance with the invention. The recursive method
starts with the root node, the top node in the tree (See FIG.
12). In this case, the root node is the top-level group node
that encompasses all of the content that shall be placed into
cards. The goal of the recursion is to place content into cards
in such a manner that the context of the content remains
while creating an intelligent manner of accessing the content
(an intelligent navigation scheme) for each different information
appliance or wireless device that may have different
display capabilities. As described above, all of the content
resides in atomic nodes and all atomic nodes are children of
groups. An example of the groups and atomics is illustrated
in FIG. 11. Groups can contain other groups or atomics
while atomics can only contain content, for example,
HTML.

FIG. 13 is an example of how the recursion method may
be accomplished but many other implementations are possible.
Recursion through the tree representation of the web
site begins at the root node. The depth-first search proceeds
to the first node without any children. This is the first atomic
having the highest priority. In addition to the previously
described attributes, two additional attributes are relevant to
the creation of cards that are placed in the presentation shoe:
"panes" and "frames." Multiple atomics can be placed, top
to bottom, into a pane. Panes can be assigned to cards
adjacently (columnized) by placing them into two frames
side-by-side on the card.

The recursion for the tree starts by passing the root node
to the Recurse Node object 202 which determines what kind
of node (e.g., a group or an atomic) is being manipulated in
step 204. In the root node's case, the passed node is a group
and it is passed to the Recurse Group object 206. The
Recurse Group object iterates through the node's children,
passing each of them back to Recurse Node 202. In the case
where Recurse Node receives an atomic, the node
is passed to Recurse Atomic step 208. In Recurse Atomic, an
atomic object is created using the node's contents. This
Atomic is then placed on to the current pane in step 210.
There are circumstances, such as columnizable
atomics, where the pane is added to the card and the
current atomic is placed in a new pane. A columnizable
atomic is one that can be placed adjacent to another atomic.

Once all of the children of a group have been passed to
Recurse Node in step 212, Recurse Group 206 attempts to
add the current pane to the current card in step 214. If the
pane fits on the card (the test occurs in step 216), then
Recurse Group returns in step 218 to Recurse Group 206 and
the recursion continues. If the pane doesn't fit on the card,
an exception is handled in Recurse Group where one of two
things can happen. If the pane contains sequential atomics
(as tested in step 220), then the pane is split, and as many of
the atomics as fit on the card are put into a
new pane and put onto the card (step 222). A link to the next
card is placed at the end of the card and then the card is
placed in the shoe and a new card is created (step 224). The new
pane with the remaining cards is put onto the new card in
step 226 and the recursion continues in step 228.

If the pane contains nonsequential atomics (as tested in
step 220), then the current node is cloned (duplicated). Then,
sequential atomic nodes that are links to the original children
are created in step 230 and added as children to the original
node while the original children are removed. The cloned
node keeps the original children. The Recurse Group step
206 then iterates through these original children by passing
them to the Recurse Node object 202. If no exceptions occur,
Recurse Group 206 then iterates through the original node's
new children. This iteration is handled in the same manner
as the original iteration through the group's children. Once
all of the atomics have been placed in panes and these panes
placed in cards, the recursion should return to the root node
and the recursion ends.

FIG. 14 illustrates a collapsing methodology for processing
the tree to create cards that can be transmitted to a device
15. The recursion of the tree as described above begins at the
root node and proceeds to its first child node 182 (node I),
the "Quotes" text 171a. A new columnizable pane is created
and atomic I (171a in FIG. 11) is added to it. This pane, as
it is closed, is added to a newly created card. The recursion
process then continues to the next node 184 (node II), the
entry box 171b, and adds this atomic to a new columnizable
pane. Similarly, node III, the "Go" submission button 171c,
is added to a new columnizable pane. These panes are passed
back to the root node and a card 240 is created that includes
the three panes.

The recursion process continues to node 188 (node IV),
the market group 172, and immediately to leaf node A, the
market graph 172a. A new pane is opened and atomic A is
added to it. The process continues to leaf node B, the market
table 172b. Atomic B is added to the pane, as is atomic C,
according to the same processing, and the recursive process
continues through the tree to node 190 (node V), the "News"
section 173. A new card 242 is created that includes atomics
A-C of group IV. Similarly, atomics A-E (news stories
173a-e of FIG. 11) are placed on a third card 80c and the
recursive process completes.

The output of the card generation is shown in FIG. 15.
Each of the cards 240, 242, 244 is shown. As shown in the
Figure, the first card 240 includes atomics 171a, 171b, 171c
from FIG. 11 wherein each atomic is in an individual pane.
The second card 242 includes atomics 172a, 172b, 172c
from FIG. 11 all grouped as a single pane on the card 242.
The third card 244 includes atomics 173a-e grouped as a
single pane. Whether atomics may be grouped on a card as
a single pane or in individual panes or any combination
thereof depends primarily on the contextual information
relating to the atomic.

Navigation amongst these cards is determined by the
sequential attribute of the root node. Thus, in the non-sequential
case, a card with links to the three cards 240-244
shown in FIG. 15 would be created. In contrast, for a
sequential case, navigation between the cards 240-244
would begin at the first card 240 and links at the bottom of
the first and second cards 240, 242 would lead to the next
successive card (e.g., the second card 242 and the third card
244) respectively. As mentioned above, the navigation
between cards can be determined from the tree.

The recursive process described above was simplified
because each child of the root node fit on an individual
page. The following description, with reference to FIGS.
16-18, illustrates a more complicated card creation process
in which the dynamic features of the layout engine 42 are
shown.

FIG. 16 illustrates an example of a web page 290 from the
CitySearch.com website. In the Figure, two different groupings
are indicated. The first grouping 291 includes the
"Review of the Week" contextual information. In the group
291, four atomics 291a-d are shown that reference the title
291a, text 291b, picture 291c and mini review 291d. The
second grouping 292 includes the "New Releases" contextual
information. In the group 292, there are different numbers
of atomics for different sub-groups. Each of the sub-groups
293-296 is grouped according to the different
movies. Atomics within these subgroups represent the
names 293a, 294a, 295a, 296a of the movies, pictures 293b,
294b, 295b, 296b of the movies and mini reviews 293c,
294c, 295c, 296c of the movies. Sub-group 293 also includes
another atomic 293d referring to the title "New Releases."
These groups and sub-groups constitute the relational hierarchy
for this portion of the CitySearch.com website as
determined by the XML engine 46 in accordance with the
invention using the XSL rule set and perl script for the
particular web page.

The tree structure for the CitySearch.com web page 290
is illustrated in FIG. 17. In the Figure, the root node 300
("1") indicates the CitySearch.com website group. Child
nodes 302, 304 (Nodes 2 and 3) indicate the two groups
"Review of the Week" 291 and "New Releases" 292. The
atomics a-d of node 2 refer to the title 291a, text 291b,
picture 291c and mini review 291d of the "Review of the
Week" group 291 of FIG. 16. The child nodes 306-312 of
node 3 (nodes 4-7) and their associated atomics a-c of the
tree represent the different movie sub-groups 293-296
included in the "New Releases" group 292.

In accordance with the invention, the layout engine 42
recursively iterates through the relational tree structure,
dynamically building cards for an appropriate screen size of
a target device 15. As described above, the recursive process
may incorporate a depth-first recursive algorithm to implement
processing of the tree structure. For example, recursion
begins at node 1, the root node of the tree and proceeds to
node 2, the "Review of the Week" group 291, to search for
the highest priority node that does not have any associated
children. The process continues to node 2a, the title portion
291a of the group 291. Atomic 2a is placed onto a pane and
the pane state is set to non columnizable. Next, the recursion
process proceeds to node 2b, the text portion 291b of the
group 291. The atomic state of the text portion 291b is
compatible with that of the title portion 291a so the atomic
291b is added to the current pane. This process continues, as
described above, until the entire tree has been processed and
the atomics have been placed in panes on cards.

Depending on the target device 15 screen size capability,
different presentation screens may need to be generated by
the system 10 so that the information may be viewed on the
device 15. For example, for a Palm Pilot device, the contextual
information associated with node 2, the "Review of
the Week" group 291, may fit entirely on the display screen
so that the associated atomics may be included on a single
card. However, the "New Releases" group 292 and its
associated subgroups 293-296 may not fit on the card and
new cards may need to be created in order to display this
information on the Palm Pilot. Since node 3, the "New
Releases" group 292, is a non-sequential group, new cards
for each of the subgroups 293-296 may be created and the
appropriate links may be inserted into the cards so that a
viewer of the information on the device can navigate
between display pages. An example of the presentation
information of the CitySearch.com web page 290 shown on
a Palm Pilot device is shown in FIG. 18A.

In contrast, for a cellular telephone device that has a
limited display screen size, a different alternative of displaying
presentation information is possible and automatically
generated by the system. In particular, depending on the size
of the display screen, the system 10 may determine that
cards are divided at different points along the creation
process and the appropriate links inserted thereon so that a
viewer of the presentation information on the phone device
can navigate between the different display pages. FIG. 18B
shows an example of the presentation information of the
CitySearch.com web page 290 shown on a cellular telephone
device. In particular, the information displayed on the
screen display of the Palm Pilot (See FIG. 18A) is
broken into one or more smaller screens wherein, for
example, the user of the phone must select the "Go" button
to move down through the various titles of the movies that
are reviewed.

While the foregoing has been with reference to a particular
embodiment of the invention, it will be appreciated by
those skilled in the art that changes in this embodiment may
be made without departing from the principles and spirit of
the invention as defined by the appended claims.

What is claimed is:

1. A system for intelligently harvesting information from
a data source for one or more different information appliances
having different input/output formatting capabilities,
comprising:

means for receiving web-based content information of a
first input/output format;

means for translating the received content information
from the first input/output format to a different input/output
format that is recognizable by a specific device;
and

means for providing the translated content information to
the device.

2. The system of claim 1 wherein the device is a wireless
device having a predetermined display screen size.

3. The system of claim 1 further comprising means for
receiving information from a device that indicates the input/output
format of the device so that the translated content is
generated based on the input/output format of the device.

4. A system for translating content information from a first
formatting language to a second formatting language, comprising:

a host system including content information of a first
formatting language;

a translation server remotely connected with the host
system, the translation server configured to receive the
content information of a first formatting language and
translate the received content information from the first
formatting language to a second formatting language;
and

a device remotely connected with the translation server,
the device configured to receive the content information
of the second formatting language and display the
content information.

5. The system of claim 4, wherein the device is a wireless 
device having a predetermined display screen size. 

6. The system of claim 4, wherein the host system
comprises an Internet web server that includes at least one 
web page thereon, the web page having a specific URL 
Internet address and including the content information of the 
first formatting language. 

7. The system of claim 4, wherein the translation server
comprises means for establishing a communications session
with the host system so that information content of the first
formatting language can be received by the translation
server; means for receiving the information content of the
first formatting language; means for converting the information
content of the first formatting language to information
content of an intermediate formatting language recognized
by the translation server; means for translating the
information content of the intermediate formatting language
to information content information of the second formatting
language; means for formatting the content information of
the second formatting language so that the content information
can be transmitted to and displayed on the device.

8. The system of claim 7 wherein the information content
of the first formatting language comprises HTML.

9. The system of claim 7, wherein the intermediate 
formatting language is XHTML. 

10. The system of claim 7, wherein the translation means 
comprises an XML engine. 

11. The system of claim 10, wherein the XML engine
comprises a hash table module for comparing the URL
Internet address of the received content information with a
group of predetermined rulesets that define a criteria for
translating the content information to the second formatting
language; and a transform processor configured to convert
the received content information of the intermediate formatting
language to content information of a second intermediate
formatting language.

12. The system of claim 11, wherein the second intermediate
formatting language comprises RML.

13. The system of claim 11, wherein the group of prede- 
termined rulesets define a branding criteria for the content 
information of the second formatting language so that the 
content information of the second formatting language displayed
on the device is similar in appearance to that of the
content information of the first formatting language received 
from the webpage. 

14. The system of claim 13, wherein the group of prede- 
termined rulesets comprise XSL rulesets defined by the host 
system.

15. The system of claim 7, wherein the formatting means 
comprises a layout engine. 

16. The system of claim 15, wherein the layout engine 
comprises a content cutter for extracting portions of the 
content information of the second intermediate language that
are not compatible with a display capability of the device so
that the content information of the second formatting language
can be generated from the remaining content information
of the second intermediate language; a layout processor
for dynamically generating a layout format for the
content information of the second formatting language that 
is optimized for the display screen size of the device; and a 
protocol processor for generating the content information of 
the second formatting language. 

17. The system of claim 16, wherein the layout format 
comprises a presentation shoe that includes at least one 
presentation card, each presentation card representing a 
display page of the device, each presentation card including 
at least a portion of the content information of the second 
formatting language, wherein subsequent presentation cards 
in the presentation shoe are linked so that the content 
information of the second formatting language can be selec- 
tively viewed on the device. 

18. The system of claim 17, wherein the presentation 
cards include at least one pane portion thereon and wherein 
the content information is separated into atomic groups of 
content information, the atomic groups of content informa- 
tion being organized and assigned to the pane portions of the 
presentation card so that formatting of the content informa- 
tion can be optimized for the display screen on the device. 

19. The system of claim 18, wherein the presentation 
cards in the presentation shoe are transmitted to the device 
so that the content information of the second formatting 
language can be displayed on the device. 

20. The system of claim 7, wherein the device formatting 
means further comprises means for receiving information 
from a device that indicates the input/output format of the 
device and means for formatting the content information of 
the second formatting language based on the received device 
information. 

21. The system of claim 4, wherein the translation server 
further comprises a content connection handler. 

22. The system of claim 21, wherein the content connec- 
tion handler comprises a client connection module config- 
ured to establish and maintain a communications session 
with the host system; a data processor configured to receive 
the content information of the first formatting language from 
the host system, a cleanup module configured to convert the 
content information of the first formatting language to 
content information of the intermediate formatting lan- 
guage; means for translating the information content of the 
intermediate formatting language to information content 
information of the second formatting language; and means 
for formatting the content information of the second for- 
matting language so that the content information can be 
transmitted to and displayed on the device. 

23. A method for intelligently harvesting information 
from a data source for one or more different information 
appliances having different input/output formatting 
capabilities, comprising: 

receiving content information of the first formatting lan- 
guage from an Internet web server; 

converting the received content information of the first 
formatting language to an intermediate formatting lan- 
guage so that the received content information can be 
processed and formatted in accordance with a display 
capability of a device; 

translating the processed content information from the 
intermediate formatting language to content informa- 
tion of the second formatting language; and 

transmitting the content information of the second for- 
matting language to the device so that the content 
information can be displayed on the device. 

24. The method of claim 23 further comprising receiving 
information from a device that indicates the input/output 
format of the device so that the translated content is generated
based on the input/output format of the device.

25. The method of claim 23, wherein the device is a
wireless device having a predetermined display screen size.

26. The method of claim 23, wherein the receiving further
comprises establishing a communications session with the
host system so that information content of the first formatting
language can be received by the translator.

27. The method of claim 26, wherein the information
content of the first formatting language comprises HTML.

28. The method of claim 26, wherein the intermediate
formatting language is XHTML.

29. The method of claim 23, wherein the translation
further comprises performing operations with an XML
engine.

30. The method of claim 29 wherein the performing
operations with an XML engine further comprises comparing
an URL Internet address of the received content information
with a group of predetermined rulesets using a
hashtable module wherein the rulesets define a criteria for
translating the content information to the second formatting
language; and converting the received content information
of the intermediate formatting language to content information
of a second intermediate formatting language using a
transform processor.

31. The method of claim 30, wherein the second intermediate
formatting language comprises RML.

32. The method of claim 30, wherein the group of
predetermined rulesets define a branding criteria for the
content information of the second formatting language so
that the content information of the second formatting language
displayed on the device is similar in appearance to
that of the content information of the first formatting language
received from the webpage.

33. The method of claim 32, wherein the group of
predetermined rulesets comprise XSL rulesets defined by the
host system.

34. The method of claim 23, wherein the formatting
comprises using a layout engine.

35. The method of claim 34, wherein the layout engine
comprises extracting portions of the content information of
the second intermediate language using a content cutter that
are not compatible with a display capability of the device so
that the content information of the second formatting language
can be generated from the remaining content information
of the second intermediate language; dynamically
generating a layout format for the content information of the
second formatting language that is optimized for the display
screen size of the device using a layout processor; and
generating the content information of the second formatting
language using a protocol processor.

36. The method of claim 35, wherein the layout formatting
comprises a presentation shoe that includes at least one
presentation card, each presentation card representing a
display page of the device, each presentation card including
at least a portion of the content information of the second
formatting language, wherein subsequent presentation cards
in the presentation shoe are linked so that the content
information of the second formatting language can be selectively
viewed on the device.

37. The method of claim 36, wherein the presentation
cards include at least one pane portion thereon and wherein
the content information is separated into atomic groups of
content information, the atomic groups of content information
being organized and assigned to the pane portions of the
presentation card so that formatting of the content information
can be optimized for the display screen on the device.

38. The method of claim 37, wherein the presentation
cards in the presentation shoe are transmitted to the device
so that the content information of the second formatting
language can be displayed on the device.

39. A system for translating content information from a
first formatting language to a second formatting language so
the content information can be displayed on a device,
comprising:

a hashtable module for comparing a URL Internet address
of a web page including the content information of the
first formatting language with a group of predetermined
rulesets that define a criteria for translating the content
information to an intermediate formatting language to
generate information of the intermediate formatting
language;

a transform processor configured to convert the received
content information of the intermediate formatting language
to content information of a second intermediate
formatting language to generate content in the second
formatting intermediate language; and

a layout engine configured to translate the content information
to the second formatting language from the
second intermediate formatting language and to format
the content information for display on the device.

40. The system of claim 39, wherein the layout engine
comprises a content cutter for extracting portions of the
content information of the second intermediate language that
are not compatible with a display capability of the device so
that the content information of the second formatting language
can be generated from the remaining content information
of the second intermediate language; a layout processor
for dynamically generating a layout format for the
content information of the second formatting language that
is optimized for the display screen size of the device; and a
protocol processor for generating the content information of
the second formatting language.

41. The system of claim 40, wherein the layout format
comprises a presentation shoe that includes at least one
presentation card, each presentation card representing a
display page of the device, each presentation card including
at least a portion of the content information of the second
formatting language, wherein subsequent presentation cards
in the presentation shoe are linked so that the content
information of the second formatting language can be selectively
viewed on the device.

42. The system of claim 41, wherein the presentation
cards include at least one pane portion thereon and wherein
the content information is separated into atomic groups of
content information, the atomic groups of content information
being organized and assigned to the pane portions of the
presentation card so that formatting of the content information
can be optimized for the display screen on the device.

43. The system of claim 42, wherein the presentation
cards in the presentation shoe are transmitted to the device
so that the content information of the second formatting
language can be displayed on the device.

44. The system of claim 39, wherein the group of predetermined
rulesets define a branding criteria for the content
information of the second formatting language so that the
content information of the second formatting language displayed
on the device is substantially similar in appearance to
the content information of the first formatting language
received from the webpage.

45. The system of claim 44, wherein the group of predetermined
rulesets comprise XSL rulesets defined by the host
system.

46. A method for intelligently harvesting information
from a data source for display on one or more different
information appliances, comprising:

receiving the information from the data source in a first
predetermined format wherein the information has pre- 
determined hierarchical relationships; 

storing the received information in a relational markup 
language to convert the received information into a
second predetermined format wherein the content of 
the received information is separated from the relation- 
ships between the received information; and 

outputting information from the second predetermined 
format into a final format for a particular information
appliance having a particular display format.

47. A layout engine for processing incoming information 
and for generating information that is displayed on one or 
more different information appliances, comprising: 

receiving information to be distributed to the one or more 
information appliances, the received information hav- 
ing relationships embedded into the content; 

mapping the receiving information into a relational hier- 
archy based on the relationships embedded into the 
content, the relational hierarchy including one or more 
atomics containing the content of the receiving information
linked to each other based on the relationships
in the received information; and 
processing the relational hierarchy based on a display 
format of a predetermined information appliance in 
order to generate a series of displays appropriate for the 
predetermined information appliance. 
48. A method for processing incoming information having 
content and relationships embedded into the content, com- 
prising: 

separating the incoming information into one or more 
pieces of content having no relationship information; 

generating an atomic for each piece of content in the 
incoming information; and 

generating a relational hierarchy connecting the atomics 
to each other in a hierarchical relationship based on the 
relationships embedded into the incoming information. 
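
As an illustration of the atomics and relational hierarchy recited in claims 47 and 48, the following is a minimal Python sketch and not the patented implementation. The class names `Atomic` and `Group`, the `lines` display-cost field, the greedy packing rule and the sample data are all assumptions introduced here for illustration; node priorities and pane or column states are omitted.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Atomic:
    """A single piece of content carrying no relationship information."""
    label: str
    lines: int  # assumed display cost, in screen lines (hypothetical)


@dataclass
class Group:
    """A node of the relational hierarchy: its own atomics plus child groups."""
    name: str
    atomics: List[Atomic] = field(default_factory=list)
    children: List["Group"] = field(default_factory=list)


def depth_first_atomics(node: Group) -> List[Atomic]:
    """Collect a group's atomics, then recurse into its children in order."""
    found = list(node.atomics)
    for child in node.children:
        found.extend(depth_first_atomics(child))
    return found


def build_cards(root: Group, screen_lines: int) -> List[List[str]]:
    """Greedily pack atomics, in depth-first order, into cards of limited height."""
    cards: List[List[str]] = [[]]
    used = 0
    for atomic in depth_first_atomics(root):
        # Start a new card (display page) when the current one would overflow.
        if cards[-1] and used + atomic.lines > screen_lines:
            cards.append([])
            used = 0
        cards[-1].append(atomic.label)
        used += atomic.lines
    return cards


# Hypothetical sample data loosely shaped like a two-group web page.
tree = Group("site", children=[
    Group("Review of the Week",
          atomics=[Atomic("2a title", 1), Atomic("2b text", 3)]),
    Group("New Releases",
          atomics=[Atomic("3 title", 1)],
          children=[Group("movie 4",
                          atomics=[Atomic("4a name", 1),
                                   Atomic("4b picture", 2)])]),
])

cards = build_cards(tree, screen_lines=4)
for number, card in enumerate(cards, start=1):
    print(number, card)
```

With a four-line screen, the sample hierarchy is split into two linked display pages; a larger `screen_lines` value yields fewer cards from the same hierarchy, which is the sense in which the content is kept separate from the device-specific layout.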

***** 


